Journal of KIISE (정보과학회논문지)

Korean Title: 사전확률 갱신을 수행하는 교차 엔트로피 계획법
English Title: Cross-Entropy Planning with Prior Updates
Authors: HyeongJoo Hwang (황형주), Youngsoo Jang (장영수), Jaeyoung Park (박재영), Kee-Eung Kim (김기응)
Citation: Vol. 47, No. 1, pp. 88-94 (January 2020)
Korean Abstract
This paper describes a cross-entropy planning method that performs prior-probability updates. Cross-entropy planning is a methodology widely used in online planning: it draws samples from a simulated environment and selects the optimal action based on the values evaluated from those samples. Existing cross-entropy planning does not use previously obtained search results during optimization and instead performs a fresh search every time. Consequently, when the search must be completed within a fixed time budget, the achievable performance is limited. This paper proposes a method that uses the results of cross-entropy planning over the action dimensions to update the prior probability used in the optimization process, and thereby achieves progressively higher performance. In the experiments, the proposed method is evaluated by comparison with cross-entropy planning in physics-based simulated environments (OpenAI Gym).
English Abstract
This paper introduces a cross-entropy planning method that updates the prior probability used in the planning optimization process. Cross-entropy planning is a popular online planning method: it draws samples from a simulation environment and selects the optimal action based on the values estimated from those samples. Its performance is limited because each optimization run starts anew, without using the results of previous planning. We propose a method that updates the prior probabilities for the optimization process based on the action sequences obtained from cross-entropy planning. The proposed method improves the performance of cross-entropy planning as planning epochs progress. We evaluate the proposed method by comparing it with cross-entropy planning in physics-based simulation environments (OpenAI Gym).
Keywords: cross-entropy method, online planning, open-loop planning, sequential decision making
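
The abstract describes the idea only at a high level and does not state the update rule, so the following is a minimal sketch under stated assumptions, not the authors' algorithm. It shows open-loop cross-entropy planning over action sequences, plus one common way to reuse the previous plan as a prior: shift the Gaussian mean by one step and widen the standard deviation. The toy 1-D environment, the function names (`rollout_return`, `cem_plan`, `shifted_prior`, `run_episode`), and all constants are hypothetical and chosen for illustration.

```python
import random
import statistics

HORIZON = 5  # planning horizon (assumed value)


def step(state, action):
    """Toy 1-D simulator (hypothetical): move by the action clipped to [-1, 1]."""
    return state + max(-1.0, min(1.0, action))


def rollout_return(state, actions):
    """Sum of rewards for an open-loop action sequence; reward = -|state|."""
    total = 0.0
    for a in actions:
        state = step(state, a)
        total -= abs(state)
    return total


def cem_plan(state, mean, std, rng, iters=5, samples=50, elites=10):
    """Refine a per-step Gaussian over action sequences with the cross-entropy method."""
    mean, std = list(mean), list(std)
    for _ in range(iters):
        candidates = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                      for _ in range(samples)]
        # Keep the highest-return sequences as the elite set.
        candidates.sort(key=lambda seq: rollout_return(state, seq), reverse=True)
        best = candidates[:elites]
        for t in range(len(mean)):
            column = [seq[t] for seq in best]
            mean[t] = statistics.fmean(column)
            std[t] = max(statistics.pstdev(column), 1e-3)  # avoid collapse to zero
    return mean, std


def shifted_prior(mean, std, min_std=0.3, init_std=1.0):
    """Assumed prior update: drop the executed step, append a fresh final step."""
    return mean[1:] + [0.0], [max(s, min_std) for s in std[1:]] + [init_std]


def run_episode(update_prior, seed=0, steps=10, start=5.0):
    """Run the planner in closed loop; return the accumulated reward."""
    rng = random.Random(seed)
    state, total = start, 0.0
    mean, std = [0.0] * HORIZON, [1.0] * HORIZON
    for _ in range(steps):
        if not update_prior:
            # Baseline: every planning call starts from the same fixed prior.
            mean, std = [0.0] * HORIZON, [1.0] * HORIZON
        mean, std = cem_plan(state, mean, std, rng)
        state = step(state, mean[0])  # execute only the first planned action
        total -= abs(state)
        mean, std = shifted_prior(mean, std)
    return total


if __name__ == "__main__":
    print("fresh prior each step:", run_episode(update_prior=False))
    print("updated prior        :", run_episode(update_prior=True))
```

Running the script prints the episode return for both settings on the toy problem. Any difference between them here says nothing about the results reported in the paper, which were obtained on OpenAI Gym environments.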